This is a tone-marked Sotho pronunciation dictionary developed for
text-to-speech synthesis.

The dictionary contains the following information in the Festival
lexicon format (see
http://www.cstr.ed.ac.uk/projects/festival/manual/festival_13.html):
  - Lexical entries with part of speech information.
  - Pronunciations using standard IPA symbols based on the Lwazi I
    phone set (see http://www.meraka.org.za/lwazi/).
  - Syllable breaks with underlying tones, where high and low tones
    are denoted H and L respectively.
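
As a rough illustration of how such entries might be read, the Python
sketch below parses one Festival-style lexicon entry into its
headword, part of speech and syllables. It assumes that the tone label
(H or L) occupies the position where Festival normally stores the
stress mark; this should be checked against the actual dictionary
file. The entry shown is an invented placeholder, not taken from the
dictionary, and the names parse_sexp and parse_entry are hypothetical.

```python
import re

# Tokens: a quoted string, a single parenthesis, or a bare symbol.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')


def parse_sexp(text):
    """Parse one S-expression into nested Python lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok.strip('"')

    return read()


def parse_entry(line):
    """Return (headword, pos_tag, [(phones, tone), ...]) for one entry.

    Assumption: each syllable is written as ((phone ...) TONE).
    """
    headword, pos_tag, syllables = parse_sexp(line)
    return headword, pos_tag, [(phones, tone) for phones, tone in syllables]


# Invented placeholder entry, not real dictionary data.
entry = parse_entry('("abc" n (((a) H) ((b c) L)))')
print(entry)
```

Each syllable comes back as a (phones, tone) pair, so a synthesis
front end could read the tone sequence directly from the third field.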

The process of development was as follows:
 - Wordlist generated from [1].
 - Pronunciation prediction and syllabification done automatically
   using the Lwazi I resources (phone set, G2P rules and
   syllabification algorithm).
 - Part of speech and tone labels digitised from [1] by 4 transcribers
   using spreadsheets and a quality control protocol in which each
   entry is processed by at least two transcribers.
 - Post-processing to discard remaining incomplete entries and entries
   with conflicting information.
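
The agreement check behind the last two steps could look roughly like
the Python sketch below. The exact rule used in the project is not
documented here; the sketch assumes an entry is kept only if at least
two transcribers supplied complete labels and all of them agree. The
function name reconcile, the data layout and the sample labels are
invented for illustration.

```python
def reconcile(annotations):
    """Keep entries whose transcriber labels are complete and identical.

    annotations maps a headword to a list of (pos_tag, tones) tuples,
    one per transcriber; None marks a field that was left empty.
    """
    accepted = {}
    for word, labels in annotations.items():
        complete = all(
            pos_tag is not None and tones is not None
            for pos_tag, tones in labels
        )
        if len(labels) < 2 or not complete:
            continue  # incomplete entry: discard
        if len(set(labels)) == 1:
            accepted[word] = labels[0]  # all transcribers agree
        # otherwise the labels conflict and the entry is discarded
    return accepted


# Invented sample data, not taken from the dictionary.
annotations = {
    "w1": [("n", "HL"), ("n", "HL")],  # agreement: kept
    "w2": [("n", "HL"), ("v", "HL")],  # conflicting part of speech
    "w3": [("n", None), ("n", "LL")],  # missing tone label
    "w4": [("v", "LH")],               # only one transcriber
}
print(reconcile(annotations))  # {'w1': ('n', 'HL')}
```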

This dictionary does not cover all entries in [1] and should be
considered a work in progress. Entries are not guaranteed to be
completely error-free; however, reasonable effort has been made to
ensure accuracy within the scope of this project (see the protocol
above).

Please note that entries in this dictionary do not necessarily
represent complete words; morphological analysis and processing are
required for use in end systems. In the case of nouns especially,
entries generally represent the word stem together with further
information about the noun class. For more information about noun
classes and parts of speech, please consult [1].

[1] J. A. du Plessis, J. G. Gildenhuys, J. J. Moiloa. Bukantswe ya
        maleme-pedi Sesotho-Seafrikanse / Tweetalige woordeboek
        Afrikaans-Suid-Sotho. Via Afrika Beperk, Cape Town. First
        edition, 1974.
